3 research outputs found

    FPGA-accelerated machine learning inference as a service for particle physics computing

    New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. Machine learning algorithms, increasingly used in particle physics for simulation, reconstruction, and analysis, are natural candidates for deployment on such platforms. We demonstrate that accelerating machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference on current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
    Comment: 16 pages, 14 figures, 2 tables
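The latency and throughput figures quoted in this abstract fit together arithmetically: a serial client bound by the 60 ms cloud round-trip can issue at most ~17 requests per second, so sustaining 600--700 inferences per second on one shared FPGA implies a few dozen concurrent CPU clients. A minimal back-of-the-envelope sketch, using only the numbers quoted above (the mid-range throughput value of 650 inf/s is an assumption for illustration):

```python
# Back-of-the-envelope check of the figures quoted in the abstract:
# 60 ms cloud / 10 ms edge latency, ~30x / ~175x speedup over CPU,
# and 600-700 inferences/s for one FPGA service shared by many CPUs.

cloud_latency_s = 0.060   # average cloud round-trip per inference
edge_latency_s = 0.010    # average edge/on-premises round-trip

# Implied CPU-only inference time from the quoted speedup factors;
# both routes point at roughly the same ~1.8 s CPU baseline.
cpu_latency_from_cloud = cloud_latency_s * 30    # ~1.80 s
cpu_latency_from_edge = edge_latency_s * 175     # ~1.75 s

# A single serial client saturates at 1/latency requests per second.
serial_rate_cloud = 1.0 / cloud_latency_s        # ~16.7 inf/s

# Concurrent clients needed to reach the quoted shared-service throughput
# (650 inf/s is an assumed mid-range of the 600-700 figure).
service_throughput = 650.0
clients_needed = service_throughput / serial_rate_cloud  # ~39 clients

print(f"CPU baseline: {cpu_latency_from_cloud:.2f}-{cpu_latency_from_edge:.2f} s")
print(f"Serial cloud client rate: {serial_rate_cloud:.1f} inf/s")
print(f"Clients to saturate service: {clients_needed:.0f}")
```

This is consistent with the abstract's picture of many CPUs sharing one FPGA service: the accelerator stays busy (higher duty cycle) even though each individual client is latency-bound.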

    Development of an FPGA Emulator for the RD53A Test Chip

    No full text
    Thesis (Master's)--University of Washington, 2019. In 2024 the LHC will be shut down for an extended period to perform upgrades to the instrument and the detectors located on it. One such upgrade is the Front End ITk upgrade in ATLAS Pixel. New hardware is being developed for this upgrade, and a test chip called the RD53A has been created as a prototype of that future hardware. Primarily because of restrictions associated with the RD53A chip, including its limited availability to researchers, its inability to generate realistic data without radiation present, and the slow pace at which changes to the physical chip can be made when issues are found, the Adaptive Computing Machines and Emulators (ACME) lab at the University of Washington built and maintains code to emulate the RD53A. The Emulator solves the problems posed by the physical chip: the code is downloadable from an online repository and runs on a commercially available FPGA, making it easily accessible to researchers; it generates realistic hit data without the need for irradiation; and, when the Emulator needs to be altered to fix issues or target certain research applications, turnaround can be completed in hours, not weeks. The motivation for the Emulator, its current state of development, and my personal work on the architecture and logic are explained in this thesis.

    CERN openlab Technical Workshop

    No full text
    We present Micron Inc.'s Deep Learning Accelerator (DLA), its software development kits, compiler, and applications. We introduce current and future DLA versions and plans for additional software tools and support. We also present a summary of current Micron collaboration and DLA-based activities with CERN.